Abstract

Epidemiological processes are studied within a recently proposed hierarchical network model 
using the susceptible-infected-refractory dynamics of an epidemic. Within the network model, a 
population may be characterized by H independent hierarchies or dimensions, each of which consists 
of groupings of individuals into layers of subgroups. Detailed numerical simulations reveal that for 
H > 1, global spreading results regardless of the degree of homophily of the individuals forming a 
social circle. For H = 1, a transition from global to local spread occurs as the population becomes
decomposed into increasingly homophilous groups. Multiple dimensions in classifying individuals 
(nodes) thus make a society (computer network) highly susceptible to large scale outbreaks of 
infectious diseases (viruses). 
I. INTRODUCTION



Epidemics of all kinds are costly. The influenza in the early 20th century and the outbreak 
of SARS in 2003 claimed precious lives, and the spreading of viruses on the internet has led 
to much economic loss when computer systems of governments and airlines were attacked. 
Recent breakdowns of major power networks in North America and in Europe also brought 
great loss and inconvenience to millions of people. A less damaging example is, perhaps, 
that of the spread of a rumor in a community. Epidemiological processes, together with 
percolation and searchability problems, are the most interesting dynamical problems
in a system of connected entities. There are two key factors in the study of
epidemic spreading, namely how infection and recovery occur and how people or computers
are connected. The latter is best described in terms of networks or graphs, in which
the nodes representing people, communities, power plants, or computers are connected by 
links representing the acquaintance among people or the connections between computers. 
The most important question is the extent to which a disease (virus) may propagate. Since 
infections take place through direct contact between infected (I) and susceptible (S) nodes, 
the underlying structure of the network understandably plays a determining role in the
spread of diseases.

The Susceptible-Infected-Refractory (SIR) model, in which the R-state is also referred to as
recovered or removed, has been widely used in different forms for studying epidemiological
processes such as the spread of influenza and SARS [10], and also for the spread of rumors
[11]. In its simplest form, a node takes on one of three states, namely S, I or R. The
R-state represents the case in which an infected node becomes recovered and immunized or,
unfortunately, dead. For computer networks, the R-state may represent recovery with the
virus removed and anti-virus software installed. A node in the R-state is not infectious and
will not become infected again. The underlying connectivity obviously plays a decisive role.
For nodes connected randomly, a substantial proportion becomes infected and eventually
recovers from an epidemic. However, when a regular network is gradually re-wired
into a small-world network [12, 15] and then a random network, it was found that
a threshold exists on the degree of disorder below which the disease can only spread
locally, but not globally [11].



Networks have fascinating structural and dynamical properties, and they have wide applications
in physical, biological, social, economic, and computer systems [3, 16, 17].
Physicists have made important contributions to the understanding of networks in recent
years. Structurally, the small-world networks [14] and the growing scale-free networks
[18] have been extensively studied. Depending on the model, the degree distribution
may show power-law or exponential forms. As the underlying connectivity plays a decisive
role in the studies of dynamical processes, a good network model that reflects the structural
properties is required for modelling epidemics or other dynamical problems
in a connected population, or in systems in which the entities are grouped in a hierarchical
fashion as in a population. The hierarchical model of searchable social networks proposed
recently by Watts et al. captures the essential ingredients of a network model of a connected
population and has enjoyed much attention. The model is physically transparent in
that it is motivated by the observation that individuals within a population are grouped
according to their function into different dimensions, for example, their occupation, hobby,
home district, etc. The model exhibits the phenomenon of "six degrees of separation", which
was discovered by Travers and Milgram in the 1960s and has been tested in a world-wide
experiment on the internet.

Recent works on the robustness of networks under attacks and on ways to
prevent attacks [23] indicate that, again, the connectivity of a network is important. It is,
therefore, important to study the dynamical aspects of the successful social network model
of Watts et al. In the present work, we present detailed results of numerical simulations
on the extent of the spread of a disease or virus in the hierarchical social network model of Watts
et al. within the SIR epidemic dynamics. It is found that the spread may be local or global,
depending on the degree of homophily, if the population can be partitioned in a unique way.
However, when the partition of the population becomes multi-dimensional, the spread is
always global, a result that differs from that of the searchability problem.

The plan of the paper is as follows. In Sec. II, we motivate the hierarchical structure and 
describe how links are established between nodes in a hierarchical structure with a tunable 
degree of homophily. In Sec. III, we describe the implementation of the SIR dynamics and
discuss the results on the spread of an epidemic in populations that can be classified into 
one and two hierarchies. Section IV summarizes our findings. 
II. NETWORK MODEL OF CONNECTED POPULATION



The model of Watts et al. was motivated by the general structure in the groupings of
individuals in a society as shown in Fig. 1. The highest level can be regarded as a
population of individuals or nodes. These N nodes may then be classified or partitioned
into b groups. Each group can further be divided into b subgroups and so on. After (l - 1)
divisions the structure has a total of l levels and ends at a level where an individual
belongs to a close functional group of size g, where g typically is of the order of 10^ to
10^. Individuals belonging to the same lowest-level subgroup have the highest chance of
being similar or becoming acquainted. For example, all physicists in academia
can be classified roughly by their research area, say condensed matter physics or nuclear
physics, and further classified by their subfields, e.g., magnetism or superconductivity, and
then further grouped by their specific research interests, e.g., topics grouped into the same
session in a conference or with the same code in the PACS classification scheme. Obviously
the divisions are usually not unique. Physicists may be grouped geographically based on
the region in which their institution is located, and then the country, state, county, department,
and research groups. A population of individuals or a collection of nodes may thus be
characterized by H hierarchies or dimensions, each of which takes on the structure shown
in Fig. 1. The structure of the network of nodes is then characterized by the parameters
H, b, l, and $\langle g \rangle$ with $N = \langle g \rangle b^{l-1}$, where $\langle g \rangle$ is the average size of the
lowest-level functional subgroups.
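
As a quick consistency check of the relation between the population size and the number of levels, the following worked arithmetic uses the values b = 2 and an average subgroup size of 10 adopted in the simulations of Sec. III, together with the level counts l = 8, 9, 10 quoted in the caption of Fig. 2:

```latex
\begin{align*}
N &= \langle g \rangle \, b^{\,l-1}, \\
l = 8:\quad N &= 10 \times 2^{7} = 1280, \\
l = 9:\quad N &= 10 \times 2^{8} = 2560, \\
l = 10:\quad N &= 10 \times 2^{9} = 5120.
\end{align*}
```

These are the three population sizes studied in Figs. 2 and 4.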

An important quantity within a hierarchy is the social distance $x_{ij}$ that measures the
similarity between two nodes i and j. For nodes belonging to the same lowest-level subgroup
in a given hierarchy, $x_{ij} = 1$; otherwise $x_{ij}$ is the number of levels, counted from the
lowest, up to the first level at which the two nodes belong to the same group. Hence the
largest separation is l in a given hierarchy.
For H > 1, an important geometrical feature is that for nodes i and j with $x_{ij} = 1$ in one
hierarchy and j and k with $x_{jk} = 1$ in another, $x_{ik}$ is in general not small. For example,
it is unlikely that the colleague in the office next door knows your collaborator in another
country. Within the context of epidemics in a community, the hierarchies may be regarded
as family, relatives, friends, and so on; or neighbors in the same apartment building, same
neighborhood, same district and so on. While the model was motivated by the structure in
groupings within a society, it is obvious that many well-structured systems, e.g. computer
clusters, are connected in a similar fashion.
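
The definition of the social distance can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the lowest-level subgroups of one hierarchy are numbered 0, 1, ..., b^(l-1) - 1 from left to right, so that integer division by b gives the index of the parent group; the function name `social_distance` and this numbering are not part of the original model description.

```python
def social_distance(group_i: int, group_j: int, b: int) -> int:
    """Social distance between two nodes within one hierarchy.

    group_i and group_j are the indices of the lowest-level subgroups the
    two nodes belong to. Assumption: subgroups are numbered left to right,
    so integer division by b maps a group to its parent group.
    """
    x = 1  # both nodes in the same lowest-level subgroup
    while group_i != group_j:
        group_i //= b  # move both nodes one level up the hierarchy
        group_j //= b
        x += 1
    return x


if __name__ == "__main__":
    b = 2  # with l = 4 levels there are 2**3 = 8 lowest-level subgroups
    print(social_distance(3, 3, b))  # same subgroup
    print(social_distance(2, 3, b))  # subgroups sharing a parent group
    print(social_distance(0, 7, b))  # only the whole population in common
```

With b = 2 and l = 4, the three calls give distances 1, 2 and 4, the last being the largest separation l.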

Computationally, a social network with a tunable degree of homophily can be constructed
as follows. For a population of N individuals with social structure as shown in Fig. 1, the
individuals are first assigned randomly into the lowest subgroups with an average size $\langle g \rangle$ in
each of the H hierarchies. Links between individuals are established as follows. For H > 1,
a hierarchy is randomly chosen. Two nodes i and j are then selected randomly within the
chosen hierarchy. A link, specifying that i and j are friends and hence acquainted,
is established with a probability $P(x_{ij}) = \exp(-a x_{ij}) / \sum_{n=1}^{l} \exp(-a n)$ depending on $x_{ij}$ of
the chosen nodes in the chosen hierarchy. No duplicated links between i and j are allowed.
Here, the parameter a is a measure of homophily of the system. The process is then repeated
until a mean number of $\langle z \rangle = \langle g \rangle - 1$ links are established for each node (individual) in the
system. This guarantees that for $a \gg -\ln b$ and H = 1, only links between nodes with
small separation are probable and the individuals are connected only to those most similar in
character, leading to isolated subgroups of nodes. For $a = -\ln b$, links between individuals
with any social distance are equally probable and a random network results, in which the notion
of similarity between nodes loses its meaning. For intermediate values of a, i.e., $a \sim 1$, the
network shows small-world features. Here, we aim at studying the extent of an epidemic as
a function of a for systems with H = 1 and H = 2, respectively.
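
The construction described above can be sketched in Python as follows. This is a literal reading of the text (draw a hierarchy, draw a random pair, accept the link with a probability that decays exponentially with the social distance), not the authors' actual code. The names `build_network` and `social_distance`, the left-to-right numbering of subgroups, and the normalization over distances 1 to l are assumptions made for illustration.

```python
import math
import random


def social_distance(gi: int, gj: int, b: int) -> int:
    """Levels to climb (the lowest counts as 1) until both groups coincide."""
    x = 1
    while gi != gj:
        gi //= b
        gj //= b
        x += 1
    return x


def build_network(l: int, b: int, g_mean: int, H: int, a: float,
                  rng: random.Random) -> list[set[int]]:
    """Return an adjacency list with one set of neighbors per node."""
    n_groups = b ** (l - 1)
    N = g_mean * n_groups
    # Random assignment to lowest-level subgroups, independently in each
    # hierarchy, so that subgroup sizes fluctuate around g_mean.
    group = [[rng.randrange(n_groups) for _ in range(N)] for _ in range(H)]
    # Assumed acceptance probability: exp(-a x) / sum_{n=1..l} exp(-a n).
    norm = sum(math.exp(-a * n) for n in range(1, l + 1))
    p = [0.0] + [math.exp(-a * x) / norm for x in range(1, l + 1)]
    adj: list[set[int]] = [set() for _ in range(N)]
    target_links = N * (g_mean - 1) // 2  # mean degree <z> = <g> - 1
    links = 0
    while links < target_links:
        h = rng.randrange(H)                       # pick a hierarchy
        i, j = rng.randrange(N), rng.randrange(N)  # pick a pair of nodes
        if i == j or j in adj[i]:                  # no self or duplicate links
            continue
        x = social_distance(group[h][i], group[h][j], b)
        if rng.random() < p[x]:
            adj[i].add(j)
            adj[j].add(i)
            links += 1
    return adj


if __name__ == "__main__":
    net = build_network(l=4, b=2, g_mean=10, H=2, a=1.0, rng=random.Random(0))
    print(len(net), sum(len(nb) for nb in net) / len(net))  # N and mean degree
```

For large a and H = 1 the loop slows down as the close pairs are used up, so a production implementation would likely sample the distance first and then a partner at that distance.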

III. EPIDEMIC MODELLING AND RESULTS 

We model epidemics on the social network model by the standard SIR dynamics, which is
implemented as follows. Initially, all nodes are in the S-state and one node is randomly
chosen for infection. At each timestep, a node i is randomly chosen among all the infected
(I) nodes. A neighbor (friend) j is then selected randomly among all the neighbors of node
i, i.e., those with a link connected to i. Note that the effect of the dimensionality H is built
in through the construction of links. If node j is susceptible, it becomes infected and the
chosen node i remains in the I-state; otherwise (i.e., node j is either I or R) the state of
node j remains unchanged and the chosen node i becomes recovered (R) at the end of the
timestep. As time evolves, the number of R-nodes (S-nodes) increases (decreases), while
the number of I-nodes increases initially and then eventually drops to zero. The number of
R-nodes at the end of the epidemic is denoted by $N_R$.
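
The update rule above translates almost directly into code. The following Python sketch runs one epidemic on a network given as an adjacency list and returns the final number of R-nodes. The function name `run_sir` and the data layout are illustrative choices, and the treatment of an infected node without any neighbor (it simply recovers) is an assumption, since the text does not address that case.

```python
import random


def run_sir(adj: list[list[int]], rng: random.Random) -> int:
    """Run one epidemic and return the final number of R-nodes."""
    S, I, R = 0, 1, 2
    state = [S] * len(adj)
    seed = rng.randrange(len(adj))  # one randomly chosen initial infection
    state[seed] = I
    infected = [seed]               # currently infected nodes
    n_recovered = 0
    while infected:
        k = rng.randrange(len(infected))  # random infected node i
        i = infected[k]
        # Assumption: an infected node with no neighbors simply recovers.
        j = rng.choice(adj[i]) if adj[i] else None
        if j is not None and state[j] == S:
            state[j] = I            # j becomes infected, i stays infected
            infected.append(j)
        else:
            state[i] = R            # contact was I or R: i recovers
            infected[k] = infected[-1]  # O(1) removal of i from the list
            infected.pop()
            n_recovered += 1
    return n_recovered


if __name__ == "__main__":
    n = 200
    complete = [[j for j in range(n) if j != i] for i in range(n)]
    print(run_sir(complete, random.Random(1)) / n)  # fraction of R-nodes
```

Each pass through the loop either infects one node or recovers one node, so a run ends after at most 2N steps.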
We performed extensive numerical simulations to explore the effects of the structural
parameter a and the dimensionality or number of hierarchies H. For each set of parameters,
data are collected over 10^-10^ runs, each of which corresponds to a different realization
of the network and a different initially infected node. We take b = 2, $\langle g \rangle = 10$ and hence
$\langle z \rangle = \langle g \rangle - 1 = 9$. Results for H = 1 are shown in Fig. 2. The structural effects on epidemics
can be studied via the probability distribution function $P(N_R)$ for different values of a
and different population sizes N. For $a = 10 \gg -\ln 2$, $P(N_R)$ is independent of N and
peaks at $N_R \approx 8$, with no network having $N_R > 25$ (see Fig. 2(a)). The independence
of the population size implies that the spread is local for large a. This is
the case of "regular networks" consisting of isolated subgroups in which only local spread
at the lowest level is possible. Each subgroup can be regarded as a small fully connected
network. Previous studies showed that about 79% of nodes in a large fully-connected network
eventually recovered. Using this result as an estimate gives $0.79 \langle g \rangle \approx 8$ for the size of
a spread, in good agreement with the peak value observed. As only the mean $\langle g \rangle$ is fixed,
each cluster has a different g, leading to a distribution $P(N_R)$ that extends to $N_R = 25$.
Simulations using different values of $\langle g \rangle$ reveal a corresponding shift in the $N_R$ at which $P(N_R)$
shows a maximum. For the small-world regime corresponding to $a \sim 1$ (see Fig. 2(b)), $P(N_R)$
is again N-independent and decays exponentially, with runs in which the spread can be up
to $N_R$ of the order of 10^. Geometrically, there exist some links of longer social distance as a decreases,
leading to larger but still localized spreads. As a decreases towards the random network
limit of $a = -\ln 2$, $P(N_R)$ takes on the form of two disjoint parts (see Fig. 2(c)): (i) one
that decays exponentially at small $N_R$ (inset), characterizing the small fraction of runs with
small-scale epidemics due to the existence of isolated clusters, and (ii) an N-dependent Gaussian
distribution characterizing global spreading centered at a value $N_R$ that scales linearly with
N. Writing $N_R = a(\langle g \rangle) N$, where the coefficient $a(\langle g \rangle)$ is not to be confused with the
structural parameter a, we found $a(\langle g \rangle) \approx 0.66$ for $\langle g \rangle = 10$. Extensive simulations reveal
that $a(\langle g \rangle)$ approaches 0.79, the value for fully-connected networks, as $\langle g \rangle$ increases. For
H = 1, the consequence of an epidemic depends sensitively on the structural parameter a.

Social networks usually correspond to H > 1. For H > 1, the characteristic features
of an epidemic are very different from those for H = 1. The most striking result is that for
H > 1, an epidemic can always spread globally regardless of the degree of homophily (i.e., the
value of a) in the network. Figure 3 shows the results for H = 2. For fixed N and $\langle z \rangle$,
$P(N_R)$ becomes independent of a and takes on a form identical to that in Fig. 2(c). The
majority of the weight is represented by the Gaussian distribution peaked at $N_R \approx a(\langle g \rangle) N$,
with $a(\langle g \rangle = 10) \approx 0.66$, a value that increases for larger $\langle g \rangle$. This result should be contrasted with
that of searchability on the same social network [19], which occurs for H > 1 and for a above a
lower bound, with increasingly restricted constraints on H and a as N increases. For epidemics, H > 1
alone is sufficient for a global spread of the disease and the result remains valid for different
values of N. The independence of a can be understood from the way that the links are
constructed. Although a larger H implies fewer links per hierarchy, the chance of nodes i
and j being similar with a shorter $x_{ij}$ in one of the hierarchies is higher and so is the chance
that a link exists between nodes i and j, even for large values of a. In other words, nodes
i and j unknown to each other in one hierarchy have a chance of becoming connected when
one or more additional dimensions are added. Thus an increase in H has the effect of promoting
the spread of a disease in a population by permeating through different connected nodes
in different hierarchies. This is consistent with daily experience that a flu may spread first
from a parent to a kid (linked nodes in a family) and then between the kids in school (linked
nodes in school) and so on. We have also studied the case of H = 3 and obtained similar
results.

To further illustrate the qualitative difference between H = 1 and H = 2 networks, we
show the mean fraction of R-nodes, $r = N^{-1} \sum_{N_R=0}^{N} N_R P(N_R)$, as a function of a in Fig. 4.
For H = 1, the large a ($a \gg 1$) regime gives $r \sim \langle g \rangle / N$, corresponding to the case
in Fig. 2(a), while the small a ($a < \ln b$) regime gives an N-independent value of $r \approx 0.66$
corresponding to the case in Fig. 2(c). The fraction r drops sharply in a small intermediate
range of a ($\ln 2 < a < 2$) corresponding to the small-world regime [12]. This feature is
similar to that observed in the spreading of rumors in a small-world construction.
In this regime, links between nodes with large dissimilarity exist and hence $N_R / N$ increases
as a decreases. One would expect that the transition becomes sharper as N increases, as $N_R$
is restricted only by the connections and is insensitive to N in the intermediate regime.
For H = 2, the behavior is qualitatively different in that r stays at a higher level for all
values of a and the transition disappears as a result of the effect of the extra hierarchy in
linking up individuals in a population.
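
The mean fraction of R-nodes can be computed from simulation output as sketched below in Python. The sketch builds the empirical distribution of final outbreak sizes over runs and then takes its first moment divided by N. The function names and the sample data are hypothetical and serve only to show the bookkeeping.

```python
from collections import Counter


def distribution(samples: list[int]) -> dict[int, float]:
    """Empirical distribution of final outbreak sizes over all runs."""
    counts = Counter(samples)
    total = len(samples)
    return {n_r: c / total for n_r, c in counts.items()}


def mean_fraction(samples: list[int], N: int) -> float:
    """r = (1/N) * sum over outbreak sizes of size * probability."""
    P = distribution(samples)
    return sum(n_r * p for n_r, p in P.items()) / N


if __name__ == "__main__":
    # Hypothetical outbreak sizes from five runs on N = 20 nodes.
    sizes = [2, 2, 14, 15, 2]
    print(mean_fraction(sizes, 20))  # agrees with sum(sizes) / (5 * 20) up to rounding
```

Because the distribution is normalized, r is simply the run-averaged outbreak size divided by the population size.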
IV. SUMMARY



In summary, epidemiological processes are studied on a hierarchical social network within
the SIR model. For a population characterized by H > 1 independent hierarchies, global
spreading results regardless of the structural parameter a and hence the homophily of individuals
forming a social circle. This result should be contrasted with that of searchability
problems in that H > 1 alone is sufficient for a global spread of a disease within the SIR
dynamics. For H = 1, a transition from global to local spread occurs as the population
becomes decomposed into increasingly homophilous groups when the structural parameter
a increases. Since social networks usually correspond to H > 1, it is therefore very difficult
to confine an epidemic, unless the "dimensionality" of the infected nodes can be reduced
at least temporarily. One way to achieve this is to quarantine the infected nodes, which is
known to be an effective way of handling infectious diseases. It will also be interesting to
extend the present study to other models of epidemics such as the SIS and SIRS
models.
FIGURE CAPTIONS



Figure 1: Schematic diagram of the groupings of individuals in a hierarchical social network.
A population of N nodes is classified into b groups. Each group is further divided into
b subgroups and so on. After (l - 1) divisions, the lowest-level subgroups consist of g
individuals. Here $x_{ij}$ is the social distance between nodes i and j. A population can, in
general, be characterized by H such hierarchies.

Figure 2: The probability distribution function $P(N_R)$ of the number of R-nodes at the end
of an epidemic in a population characterized by H = 1 hierarchy. The structural parameter
takes on (a) a = 10, (b) a = 1, (c) a = -ln 2. Different population sizes N = 1280 (circles),
2560 (squares), and 5120 (triangles) corresponding to l = 8, 9, 10 are studied. Each data
point represents an average over 10^ runs. The inset in (c) shows the portion of $P(N_R)$ for
small $N_R$.

Figure 3: The probability distribution function $P(N_R)$ in a population of N = 2560 (l = 9)
characterized by H = 2 hierarchies. The structural parameter takes on a = 10 (circles),
a = 1 (squares), and a = -ln 2 (triangles). Each data point represents an average over 10^
runs. $P(N_R)$ signifies a global spread of disease regardless of the structural parameter for
H > 1.

Figure 4: The fraction of R-nodes at the end of an epidemic as a function of the structural
parameter a in a population of N = 1280 (circles), 2560 (squares), and 5120 (triangles)
characterized by (a) H = 1 and (b) H = 2 hierarchies. Each data point represents an
average over 10^ runs. For H > 1, global spreading occurs for all values of a, while a
transition from global to local spreading occurs for H = 1.